Distant-Talking Speech Recognition Based on Spectral Subtraction by Multi-Channel LMS Algorithm
Abstract
We propose a blind dereverberation method based on spectral subtraction using a multi-channel least mean squares (MCLMS) algorithm for distant-talking speech recognition. In a distant-talking environment, the channel impulse response is longer than the short-term spectral analysis window. By treating the late reverberation as additive noise, a noise reduction technique based on spectral subtraction estimates the power spectrum of the clean speech from the power spectra of the distorted speech and the unknown impulse responses. To estimate the power spectra of the impulse responses, a variable step-size unconstrained MCLMS (VSS-UMCLMS) algorithm for identifying the impulse responses in the time domain is extended to the frequency domain. To reduce the effect of the estimation error of the channel impulse response, we normalize the early reverberation by cepstral mean normalization (CMN) instead of by spectral subtraction using the estimated impulse response. Furthermore, the proposed method is combined with conventional delay-and-sum beamforming. We conducted recognition experiments on distorted speech signals simulated by convolving multi-channel impulse responses with clean speech. The proposed method achieved a relative error reduction rate of 22.4% relative to conventional CMN. Combining the proposed method with beamforming achieved a relative error reduction rate of 24.5% relative to conventional CMN with beamforming, using only an isolated word (with a duration of about 0.6 s) to estimate the spectrum of the impulse response.
Key words: distant-talking speech recognition, blind dereverberation, multi-channel least mean squares, spectral subtraction, cepstral mean normalization
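The split described in the abstract (late reverberation removed by spectral subtraction, early reverberation left to CMN) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes the reverberant power spectrogram is approximately a frame-wise convolution of the clean power spectrogram with the impulse-response power spectra, and it takes those impulse-response spectra as given rather than estimating them with VSS-UMCLMS. The function names and the parameters `alpha` (over-subtraction factor), `beta` (spectral floor), and `eps` are hypothetical.

```python
import numpy as np


def subtract_late_reverb(power_x, power_h, alpha=1.0, beta=0.1, eps=1e-12):
    """Recursively subtract late reverberation from a power spectrogram.

    power_x: (frames, bins) power spectrogram of the reverberant speech.
    power_h: (D, bins) power spectra of the impulse response, one row per
             frame-length segment; row 0 is the early part.
    alpha, beta, eps: hypothetical over-subtraction factor, spectral floor,
             and guard against division by zero.
    Returns the estimated early-reverberant power spectrogram, i.e. clean
    power scaled by power_h[0]; that scaling is left for CMN to remove.
    """
    power_x = np.asarray(power_x, dtype=float)
    power_h = np.asarray(power_h, dtype=float)
    num_frames = power_x.shape[0]
    num_segments = power_h.shape[0]
    # Late segments expressed relative to the early segment.
    ratio = power_h / np.maximum(power_h[0], eps)
    estimate = np.empty_like(power_x)
    for f in range(num_frames):
        late = np.zeros(power_x.shape[1])
        # Accumulate the reverberant tail contributed by earlier frames.
        for d in range(1, min(num_segments, f + 1)):
            late += estimate[f - d] * ratio[d]
        # Floor the result so the power never becomes negative.
        estimate[f] = np.maximum(power_x[f] - alpha * late, beta * power_x[f])
    return estimate


def cepstral_mean_normalization(features):
    """Subtract the per-utterance mean from each feature dimension.

    features: (frames, dims) log-spectral or cepstral features.
    """
    features = np.asarray(features, dtype=float)
    return features - features.mean(axis=0, keepdims=True)
```

Under this assumed model, the early part of the impulse response is a constant multiplicative factor per frequency bin, so it becomes an additive constant in the log-spectral (and hence cepstral) domain and is removed by the mean subtraction. That is why the sketch never divides by `power_h[0]` in the final estimate.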
Related resources
Dereverberation Based on Spectral Subtraction by Multi-Channel LMS Algorithm for Hands-Free Speech Recognition
In a distant-talking environment, channel distortion drastically degrades speech recognition performance because of a mismatch between the training and testing environments. The current approach focusing on automatic speech recognition (ASR) robustness to reverberation and noise can be classified as speech signal processing [1, 4, 5, 14], robust feature extraction [10, 20], and model adaptation...
Blind dereverberation based on CMN and spectral subtraction by multi-channel LMS algorithm
We proposed a blind dereverberation method based on spectral subtraction by Multi-Channel Least Mean Square (MCLMS) algorithm for distant-talking speech recognition in our previous study [1]. In this paper, we discuss the problems of the proposed method and present some solutions. In a distant-talking environment, the length of channel impulse response is longer than the short-term spectral anal...
Speech Recognition by Dereverberation Method Based on Multi-channel LMS Algorithm in Noisy Reverberant Environment
In a distant-talking environment, channel distortion drastically degrades speech recognition performance because of mismatches between the training and test environments. The current approaches focusing on robustness issues for automatic speech recognition (ASR) in noisy reverberant environments can be classified as speech enhancement, robust feature extraction, or model adaptati...
Blind Dereverberation Based on Generalized Spectral Subtraction by Multi-channel LMS Algorithm
A blind dereverberation method based on power spectral subtraction (SS) using a multi-channel least mean squares algorithm was previously proposed. The results of isolated word speech recognition experiments showed that this method achieved significant improvement over conventional cepstral mean normalization (CMN). In this paper, we propose a blind dereverberation method based on generalized s...
Distant-talking speaker identification by generalized spectral subtraction-based dereverberation and its efficient computation
Previously, a dereverberation method based on generalized spectral subtraction (GSS) using multi-channel least mean-squares (MCLMS) was proposed. The results of speech recognition experiments showed that this method achieved a significant improvement over conventional methods. In this paper, we apply this method to distant-talking (far-field) speaker recognition. However, for far-field spe...
Journal:
- IEICE Transactions
Volume 94-D
Publication date 2011